Biostatistics For Dummies (Monika Wahi John Pezzullo)

diagnosis, start of therapy, end of therapy, start of improvement or remission, date of relapse, or

others. For events, you should record the date of each event if it recurs, and even if death is not the

event of interest, the date of death should be recorded if available. For censoring purposes, ensure

that you are collecting dates of contact so you can identify a last-seen date if needed. If you

collect your data properly, you will later be able to calculate any time interval you need, as well as

create any event status indicator you need.
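
As a minimal sketch of how recorded dates turn into analysis variables, the Python snippet below derives a follow-up time and an event status indicator for one participant. The function name, its arguments, and the example dates are hypothetical and chosen only for illustration; it assumes the event date is missing (None) whenever the event was not observed.

```python
from datetime import date

def time_and_status(start, event_date, last_contact):
    """Return (days_of_follow_up, event_indicator) for one participant.

    event_indicator is 1 if the event was observed and 0 if the
    participant was censored at the last-seen date.
    """
    if event_date is not None:
        return (event_date - start).days, 1
    return (last_contact - start).days, 0

# Hypothetical participants (dates invented for illustration)
died = time_and_status(date(2020, 1, 15), date(2021, 3, 1), date(2021, 3, 1))
censored = time_and_status(date(2020, 1, 15), None, date(2022, 1, 15))

print(died)      # follow-up time in days, event indicator 1
print(censored)  # follow-up time in days, event indicator 0
```

Because the raw dates are kept, the same records could later be reused with a different starting point (for example, end of therapy instead of diagnosis) without going back to the source documents.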

Dates and times should be recorded to suitable precision. If your study timeline spans years, it’s best to

keep track of dates to the day. In a Phase I clinical trial (see Chapter 5), participants may be studied

for events that happen in a span of a few days. In those cases, it’s important to record dates and times

to the nearest hour or minute. You can even envision laboratory studies of intracellular events where

time would have to be recorded with millisecond — or even microsecond — precision!

Dates and times can be stored in different ways in different statistical software (as well as

Microsoft Excel). Designating columns as being in date format or time format can allow you to

perform calendar arithmetic, allowing you to obtain time intervals by subtracting one date from

another.
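
To illustrate calendar arithmetic and why precision matters, here is a small Python sketch using the standard datetime module. The variable names and timestamps are hypothetical. It subtracts a dosing time from an event onset time, first at minute precision and then at day precision.

```python
from datetime import datetime

# Hypothetical timestamps recorded to the minute
dose = datetime(2023, 5, 1, 8, 30)
onset = datetime(2023, 5, 3, 14, 45)

# Subtracting two datetimes yields a timedelta
elapsed = onset - dose
elapsed_hours = elapsed.total_seconds() / 3600

# The same interval computed from dates only (time of day discarded)
elapsed_days_only = (onset.date() - dose.date()).days

print(elapsed_hours)      # interval in hours, including the fractional part
print(elapsed_days_only)  # interval in whole days
```

When events unfold over a few days, the day-only version hides the extra hours that the minute-level timestamps preserve.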

Miscoding censoring information

It can be surprisingly easy to miscode the event status indicator. If the name of the variable is Death,

and is coded as 1 if the participant died during the observation period and 0 if they were censored, this

seems intuitive. But analysts may want to identify all the censored observations in their data, so they

may create a censored indicator named Censored, and code it as 1 if the participant is censored, and 0

if they are not. Because data may be used for different types of survival analyses, there could be other

event indicators included in the data as well, also coded as 1 and 0.

The problem is that if you accidentally use your censored indicator instead of your event indicator

when running your survival analysis, you will unknowingly flip your analysis, and you won’t get any

warning or error message from the program. You’ll only get incorrect results. Worse, depending on

how many censored and uncensored observations you have, the survival curve may also not hint at any

errors. It may look like a perfectly reasonable survival curve for your data, even though it’s completely

wrong.
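
The effect of swapping the indicators can be seen in a small Python sketch. The kaplan_meier function below is a simplified, hypothetical implementation of the product-limit estimate written for this illustration (it is not taken from any statistical package), and the five participants are invented data.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time with at least one event.

    events uses 1 for an observed event and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve

# Invented data: follow-up times with matching Death and Censored indicators
times = [2, 4, 5, 7, 9]
death = [1, 0, 1, 0, 0]
censored = [0, 1, 0, 1, 1]

correct = kaplan_meier(times, death)
flipped = kaplan_meier(times, censored)  # the accidental swap

print(correct)
print(flipped)
```

Both calls run without complaint and both return a decreasing step function, yet the two curves disagree at every step, which is why the mistake is so easy to miss.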

You have to read your software’s documentation carefully to make sure you code your event

variable correctly. Also, you should always check the program’s output for the number of

censored and uncensored observations and compare them to the known count of censored and

uncensored participants in your data file.
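
The count check described above can be sketched in Python as follows. The column names Death and Censored follow the example in the text, while the function names, the records, and the expected counts are hypothetical. Note that the count comparison cannot detect a swap when the numbers of events and censored observations happen to be equal, so a second check confirms that the two indicators are complementary on every row.

```python
from collections import Counter

# Invented records: 2 deaths and 3 censored participants
records = [
    {"id": 1, "Death": 1, "Censored": 0},
    {"id": 2, "Death": 0, "Censored": 1},
    {"id": 3, "Death": 1, "Censored": 0},
    {"id": 4, "Death": 0, "Censored": 1},
    {"id": 5, "Death": 0, "Censored": 1},
]

def status_counts_match(records, event_col, expected_events, expected_censored):
    """True if event_col has the expected number of 1s (events) and 0s (censored)."""
    counts = Counter(r[event_col] for r in records)
    return counts[1] == expected_events and counts[0] == expected_censored

def indicators_are_complementary(records, event_col, censored_col):
    """True if exactly one of the two indicators equals 1 on every row."""
    return all(r[event_col] + r[censored_col] == 1 for r in records)

# Known from the study records: 2 events, 3 censored
ok_with_death = status_counts_match(records, "Death", 2, 3)
ok_with_censored = status_counts_match(records, "Censored", 2, 3)
complementary = indicators_are_complementary(records, "Death", "Censored")

print(ok_with_death)     # the event indicator agrees with the known counts
print(ok_with_censored)  # the censored indicator does not
print(complementary)
```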